Comparing genomes with rearrangements and segmental duplications

Authors

  • Mingfu Shao
  • Bernard M. E. Moret
Abstract

MOTIVATION: Large-scale evolutionary events such as genomic rearrangements and segmental duplications form an important part of the evolution of genomes and are widely studied from both biological and computational perspectives. A basic computational problem is to infer these events in the evolutionary history of given modern genomes, a task for which many algorithms have been proposed under various constraints. Algorithms that can handle both rearrangements and content-modifying events such as duplications and losses remain few and limited in their applicability.

RESULTS: We study the comparison of two genomes under a model that includes general rearrangements (through double-cut-and-join, DCJ) and segmental duplications. We formulate the comparison as an optimization problem and describe an exact algorithm that solves it with an integer linear program (ILP). We also devise a sufficient condition and an efficient algorithm to identify optimal substructures, which simplify the problem while preserving optimality. Combining the optimal substructures with the ILP formulation yields a practical and exact algorithm for the problem. We then apply our algorithm to assign in-paralogs and orthologs (a necessary step in handling duplications) and compare its performance with that of the state-of-the-art method MSOAR, using both simulations and real data. On simulated datasets, our method outperforms MSOAR by a significant margin. On five well-annotated species, MSOAR achieves high accuracy, yet our method performs slightly better on each of the 10 pairwise comparisons.

AVAILABILITY AND IMPLEMENTATION: http://lcbb.epfl.ch/softwares/coser.
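As background for the double-cut-and-join model mentioned above, the following is a minimal sketch of the standard DCJ distance in the simplest setting only: two genomes with identical gene content, no duplicated genes, and circular chromosomes, where the distance equals the number of genes minus the number of cycles in the adjacency graph. It is not the ILP formulation of the paper and does not handle segmental duplications; the function names `adjacency_partners` and `dcj_distance` and the signed-integer genome encoding are assumptions made for illustration.

```python
def adjacency_partners(genome):
    """Map each gene extremity to the extremity adjacent to it.

    `genome` is a list of circular chromosomes, each a list of signed
    integers (hypothetical encoding: +g is gene g read tail-to-head,
    -g is gene g read head-to-tail).
    """
    partner = {}
    for chromosome in genome:
        for i, gene in enumerate(chromosome):
            nxt = chromosome[(i + 1) % len(chromosome)]
            # Right end of the current gene, left end of the next gene.
            right = (abs(gene), "h" if gene > 0 else "t")
            left = (abs(nxt), "t" if nxt > 0 else "h")
            partner[right] = left
            partner[left] = right
    return partner


def dcj_distance(genome_a, genome_b):
    """DCJ distance for duplication-free circular genomes on the same genes."""
    pa = adjacency_partners(genome_a)
    pb = adjacency_partners(genome_b)
    if pa.keys() != pb.keys():
        raise ValueError("genomes must have the same gene content")

    seen = set()
    cycles = 0
    for start in pa:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Walk one alternating cycle: an adjacency of A, then one of B.
        while x not in seen:
            seen.add(x)
            y = pa[x]
            seen.add(y)
            x = pb[y]

    genes = len(pa) // 2  # every gene contributes two extremities
    return genes - cycles


if __name__ == "__main__":
    print(dcj_distance([[1, 2, 3]], [[1, -2, 3]]))
```

In this sketch a single inversion or a single fission of a circular chromosome each count as one DCJ operation. Once genes are duplicated, the correspondence between gene copies is no longer given, which is the part of the problem that the abstract addresses through its optimization formulation.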

Similar resources

Enrichment of segmental duplications in regions of breaks of synteny between the human and mouse genomes suggest their involvement in evolutionary rearrangements.

The sequence of the mouse genome allows one to compare the conservation of synteny between the human and mouse genome and exploration of regions that might have been involved in major rearrangements during the evolution of these two species (evolutionary genome rearrangements). Recent segmental duplications (or duplicons) are paralogous DNA sequences with high sequence identity that account for...

I-44: Mutagenesis during Embryogenesis

We developed several novel tools to genome wide screen for CNVs and SNPs in single cells. When applied to cleavage stage embryos from young fertile couples we discovered, unexpectedly, an extremely high incidence of chromosomal instability, a hallmark of tumorigenesis (Vanneste et al., Nature Medicine, 2009; Vanneste et al., Hum.Reprod., 2011). Not only mosaicisms for whole chromosome aneuploid...

A High-Resolution Map of Synteny Disruptions in Gibbon and Human Genomes

Gibbons are part of the same superfamily (Hominoidea) as humans and great apes, but their karyotype has diverged faster from the common hominoid ancestor. At least 24 major chromosome rearrangements are required to convert the presumed ancestral karyotype of gibbons into that of the hominoid ancestor. Up to 28 additional rearrangements distinguish the various living species from the common gibb...

A Survey of Innovation through Duplication in the Reduced Genomes of Twelve Parasites

We characterize the prevalence, distribution, divergence, and putative functions of detectable two-copy paralogs and segmental duplications in the Apicomplexa, a phylum of parasitic protists. Apicomplexans are mostly obligate intracellular parasites responsible for human and animal diseases (e.g. malaria and toxoplasmosis). Gene loss is a major force in the phylum. Genomes are small and protein...

Recovering genome rearrangements in the mammalian phylogeny.

The analysis of genome rearrangements provides a global view on the evolution of a set of related species. We present a new algorithm called EMRAE (efficient method to recover ancestral events) to reliably predict a wide-range of rearrangement events in the ancestry of a group of species. Using simulated data sets, we show that EMRAE achieves comparable sensitivity but significantly higher spec...

Journal:
  • Bioinformatics

Volume 31, Issue 12

Publication date: 2015